BLI single-tip method data analysis example - work in progress

intro

This Jupyter notebook walks you through how to analyze BLI data generated using the "single tip method", where a single tip is used repeatedly to measure binding across a range of titrant (species not on the tip) concentrations.

To complete the analysis, run all cells in order, filling in or changing input values as needed. To export a plot made using pyplot, add plt.savefig('filename.jpg', dpi=200) to the cell in which it was generated.
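As a minimal sketch of the export step, the cell below draws a placeholder trace and saves it. The data, the axis labels, and the file name example_plot.jpg are made up for illustration; only plt.savefig and its dpi argument come from the text above.

```python
from pathlib import Path

import matplotlib.pyplot as plt

# Placeholder data standing in for a BLI trace (not real measurements)
time_s = [0, 10, 20, 30, 40]
signal_nm = [0.00, 0.12, 0.20, 0.24, 0.26]

fig, ax = plt.subplots()
ax.plot(time_s, signal_nm)
ax.set_xlabel("time (s)")
ax.set_ylabel("signal (nm)")

# Call savefig in the same cell that created the figure
output_file = Path("example_plot.jpg")  # hypothetical file name
plt.savefig(output_file, dpi=200)
plt.close(fig)
```

Calling savefig in the same cell matters because, in a notebook, the figure is normally closed once the cell has finished rendering.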

Requirements: This notebook requires Python3, and the following modules:

If you have Anaconda, you probably already have all of these modules except lmfit. Install it with: conda install -c conda-forge lmfit

Required experiment-specific files

On the BLI instrument, export the raw data file. It will be saved as something like RawData0.xls. If there are multiple raw data files, you can easily merge them as long as all of the data have different "tip well IDs". These are the wells in which the tips are stored prior to the experiment (see sensor plate image: SensorPlate-2.jpg).
You will then need to make a sample_key.csv file. I make the file in Excel and then save it as a csv. For each "tip well ID", the sample_key.csv file defines the name of the species on the tip (tip) and the name of the species being titrated (titrant). TODO: explain this better

example of what sample_key.csv should look like:

tip well | tip   | titrant
A6       | pCare | INV
B6       | pCare | INV
C6       | pCare | INV
A7       | empty | INV
B7       | empty | INV
C7       | empty | INV
D8       | Lpd3  | EVH1 only
D9       | empty | EVH1 only
D10      | SHIP2 | EVH1 only
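
A minimal sketch of loading and checking such a sample key with pandas. The column names are taken from the example table above and are an assumption about how the real file is labelled; the text is read from a string here so that the cell runs without any file on disk.

```python
import io

import pandas as pd

# Column names follow the example table; the real file may name them differently
csv_text = """tip well,tip,titrant
A6,pCare,INV
A7,empty,INV
D8,Lpd3,EVH1 only
D9,empty,EVH1 only
"""

# In the notebook this would be pd.read_csv("sample_key.csv")
sample_key = pd.read_csv(io.StringIO(csv_text))

# Each tip well ID should appear only once
assert sample_key["tip well"].is_unique

# Look up which wells were exposed to a given titrant
evh1_wells = sample_key.loc[sample_key["titrant"] == "EVH1 only", "tip well"].tolist()
print(evh1_wells)
```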

script idea/pipeline overview

I wrote this script to be modular: you can look at and plot your data in different ways and deal with complicated plate setups dynamically. You can plot subsets of the data in stages, decide which samples to calculate binding curves for, process samples with different loading times and different methods, etc.
For now, this means you have to examine the data and make decisions instead of just pressing go. If your data are very reliable and always look the same, it would be fairly easy to turn this into a script that runs without intervention.

Basic pipeline: most of the functions for processing the data are in the file BLI_tools.py. I use pandas heavily, so nearly everything is stored in a dataframe.

  1. Import the sample_key as a dataframe.
  2. Import the data as a dataframe with BLI_tools.import_raw_data().
  3. Create a BLI_data = BLI_tools.BLI_data_df() object using the sample key and raw data as inputs.
  4. You can plot the raw data with BLI_data.pyplot_plot_samples() or BLI_data.plotly_plot_samples() at any point. I like to use BLI_data.plotly_plot_samples() first and save an .html file of all the raw data so that I can look at it interactively later.
  5. You can then use any of the BLI_data.set...() methods to select a subset of the data in different ways. There are methods for setting wells based on tip/titrant names, enumerating through well IDs, or setting the wells directly (see below for more details).
  6. If you want, you can also baseline normalize the data using BLI_data.zero_cols(). This has no effect on the binding signal calculation, however.
  7. To calculate binding curves, you first have to use BLI_data.set_assay_times() to set the experimental parameters used for calculating the binding signal.
  8. You can then preview how the binding signal will be calculated with BLI_data.binding_signal_preview().
  9. Finally, you can calculate binding curves with the BLI_data.generate_binding_curves2() function and process them further however you would like.
    TODO: add method to do curve fitting and empty subtraction. output binding curve that is merged with sample_key to provide more info about samples

    Written by: Jackson Halpin, July 2021

example overview

For the data here, I will be working with multiple datasets collected with the same plate, on the same date.

Part 0 - preparation and initial plotting

Imports

import and merge datafiles

We have multiple datafiles, so I am going to make a datafiles list and import and merge all of the datafiles into one dataframe.
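
The sketch below illustrates, with plain pandas, why the merge only works when every tip well ID is unique across files. It is not the BLI_tools.import_raw_data() implementation: the table layout (one column per tip well ID, indexed by time) and all values are assumptions made for illustration.

```python
import pandas as pd

# Two made-up tables standing in for two exported raw data files.
# Assumed layout: one column per tip well ID, indexed by time in seconds.
run_a = pd.DataFrame({"A6": [0.00, 0.10], "B6": [0.01, 0.12]}, index=[0.0, 0.2])
run_b = pd.DataFrame({"D8": [0.02, 0.20], "D9": [0.00, 0.03]}, index=[0.0, 0.2])
datasets = [run_a, run_b]

# Merging is only unambiguous when no tip well ID occurs in more than one file
all_wells = [well for df in datasets for well in df.columns]
duplicates = sorted({w for w in all_wells if all_wells.count(w) > 1})
if duplicates:
    raise ValueError(f"tip well IDs found in more than one file: {duplicates}")

# Place the datasets side by side, aligned on the shared time index
merged = pd.concat(datasets, axis=1)
print(merged.columns.tolist())
```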

plot all the data and save as html

Plot all the data and export it as an html file using plotly. You can open this file in a web browser later and look at the data interactively.


Lesson: setting sample wells

You can set which sample wells to use with several different methods:

Once you set sample wells, any subsequent methods run with the data object will only use the wells that you set (see example below).
Use data.sample_wells to see the currently set samples.
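
The selection behaviour described above can be mimicked with a toy class. ToyBLIData, set_wells, and selected are hypothetical names invented for this sketch and are not the BLI_tools API; only the df and sample_wells attributes mirror names used in this notebook.

```python
import pandas as pd


class ToyBLIData:
    """Toy stand-in for the notebook's data object (not the BLI_tools class)."""

    def __init__(self, df):
        self.df = df                          # full dataset, never modified
        self.sample_wells = list(df.columns)  # start with every well selected

    def set_wells(self, wells):
        # Hypothetical setter: it only records which wells are active
        missing = [w for w in wells if w not in self.df.columns]
        if missing:
            raise KeyError(f"unknown wells: {missing}")
        self.sample_wells = list(wells)

    def selected(self):
        # Later steps read only the active wells
        return self.df[self.sample_wells]


data = ToyBLIData(pd.DataFrame({"A6": [0.1, 0.2], "A7": [0.0, 0.0], "D8": [0.3, 0.5]}))
data.set_wells(["A6", "A7"])
print(data.selected().columns.tolist())
data.set_wells(["D8"])  # replaces the earlier selection; the data are intact
print(data.df.shape)
```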

Example

Setting the sample_wells does not remove any data from the object; it only designates which wells to use in subsequent methods.
Example - we can still set different wells after we've already run the cell above:


Part 1 - "unfittable" data

Let's make plots of just the INV data and save them as images

baseline normalization

Use the data.zero_cols() function to baseline normalize the data. Again, this doesn't destroy the original data (data.df); it creates a new normalized dataset (data.baseline_subtracted_data). Baseline normalization is irrelevant for calculating binding curves the way we calculate them here, but it is sometimes useful for plotting.
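
One common way to baseline normalize is to subtract each trace's mean over an early time window. The sketch below shows that idea with plain pandas; it is an assumption about the general approach, not the data.zero_cols() implementation, and the window and values are made up.

```python
import pandas as pd

# Made-up traces: one column per well, indexed by time in seconds
raw = pd.DataFrame(
    {"A6": [1.00, 1.02, 1.30, 1.45], "A7": [0.50, 0.50, 0.52, 0.53]},
    index=[0, 10, 20, 30],
)

# Assumed baseline window: everything up to and including t = 10 s
baseline_end = 10
baseline = raw.loc[:baseline_end].mean()

# Subtraction returns a new DataFrame, so `raw` is left untouched
baseline_subtracted = raw - baseline
print(baseline_subtracted.round(3))
```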

Part 2: binding curves

set concentrations and method time steps

Let's look at this data first. We need to find the time point that we want to define as our first binding signal (at the end of the first association phase). Since the 3 different peptides have different loading times, we will have to process each titration individually, using a different initial offset time for each.

one at a time - D8

Set the first_association_time for the first sample and use the data.binding_signal_preview() function to see how the signals are being calculated.

The binding signal is calculated as:
(average value within the red shaded region) - (average value within the subsequent grey region)
See the data.set_assay_times() documentation for more details.
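
The calculation above can be sketched with numpy and pandas as follows. The trace is synthetic and the window positions are hypothetical; in the notebook they follow from the values passed to data.set_assay_times().

```python
import numpy as np
import pandas as pd

# Made-up trace sampled once per second: 0.0 until t = 100 s,
# 0.4 during t = 100..159 s, then 0.1 from t = 160 s onwards
time_s = np.arange(0, 220)
values = np.where(time_s < 100, 0.0, np.where(time_s < 160, 0.4, 0.1))
trace = pd.Series(values, index=time_s)


def window_mean(series, start, end):
    """Mean of the samples with start <= time < end."""
    in_window = (series.index >= start) & (series.index < end)
    return series[in_window].mean()


# Hypothetical window positions, chosen for this synthetic trace only
red_window = (150, 160)   # end of the association phase
grey_window = (170, 180)  # the region that follows it

binding_signal = window_mean(trace, *red_window) - window_mean(trace, *grey_window)
print(round(binding_signal, 3))
```

Averaging over a window rather than reading a single point makes the result less sensitive to noise in any one sample.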

one at a time - D9

one at a time - D10

Subtract the Empty signal from the other signals
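
A minimal sketch of the subtraction with pandas, assuming the binding signals are held in a table with one row per titrant concentration and one column per well. The values and the concentration_uM index name are made up; D9 is used as the reference because it is the "empty" tip for this titrant in the example sample key.

```python
import pandas as pd

# Made-up binding signals, one row per titrant concentration
signals = pd.DataFrame(
    {"D8": [0.10, 0.25, 0.40], "D9": [0.02, 0.03, 0.05], "D10": [0.08, 0.15, 0.22]},
    index=pd.Index([1.0, 5.0, 25.0], name="concentration_uM"),
)

empty_well = "D9"  # the "empty" tip in the example sample key

# Subtract the empty-tip signal, row by row, from every other well
corrected = signals.drop(columns=empty_well).sub(signals[empty_well], axis=0)
print(corrected.round(3))
```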

fit and plot binding curves
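
As a rough illustration of the fitting step, the sketch below fits a simple one-site binding model to synthetic data. Two things here are assumptions rather than this notebook's method: the one-site model itself, and the use of scipy.optimize.curve_fit in place of lmfit (which the requirements mention), chosen only to keep the example short.

```python
import numpy as np
from scipy.optimize import curve_fit


def one_site_binding(conc, bmax, kd):
    """Signal for simple 1:1 binding: bmax * conc / (kd + conc)."""
    return bmax * conc / (kd + conc)


# Synthetic, noise-free data generated from known parameters (not measurements)
conc_uM = np.array([0.5, 1.0, 2.0, 5.0, 10.0, 25.0, 50.0])
signal = one_site_binding(conc_uM, bmax=0.5, kd=4.0)

# Starting guesses: largest signal for bmax, median concentration for kd
p0 = [signal.max(), np.median(conc_uM)]
(bmax_fit, kd_fit), _ = curve_fit(one_site_binding, conc_uM, signal, p0=p0)
print(f"Bmax = {bmax_fit:.3f}, Kd = {kd_fit:.3f} uM")
```

With real data, the fitted curve can be drawn over the measured points by evaluating one_site_binding on a fine concentration grid and plotting both with pyplot.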